class: center, middle, inverse, title-slide

.title[
# Exploring the Advantages of R Admiral Package
]
.author[
### Shuang Gao
]
.institute[
### BeiGene
]

---

## Agenda

- The three **W's** of admiral
  - **What** is admiral
  - **Where** is the package located
  - **Why** use admiral
- Design of admiral
- How to use admiral
- Summary

---

## What is admiral

- An **open-source** toolbox that enables the development of ADaM datasets in R.
- **Modularized functions**, each with a standalone purpose.
- Designed for **code readability**, ease of understanding, and flexibility.
- Developed in **collaboration** between the following companies:

<img src="img/admiralsystem.png" width="600" height="300" style="margin-left:200px;">

---

## Where is the package located

.pull-left[
<img src="img/admiral-homepage.png" width="600" height="350" >
]

.pull-right[
- Find **installation** guidance on the Admiral website homepage.
- See **user guides** for creating ADaMs with Admiral.
- Also look at our therapeutic area extension packages **Admiralonco, Admiralophtha and Admiralvaccine**.

.footnote[
Find out more: [Admiral website](https://pharmaverse.github.io/admiral/cran-release/)

Feedback: [Admiral GitHub](https://github.com/pharmaverse/admiral)
]
]

---

## Why use admiral

.pull-left[
#### Challenge

- No **standard solutions** for ADaMs
- A changing and **novel data landscape**
- New therapeutic areas and analysis concepts
]

.pull-right[
<img src="img/asset.PNG" width="400" height="200" style="margin-right:150px;">

#### **Modularized toolbox**

> A modularized toolbox aims to be both flexible and powerful.
]

---

## Why use admiral

#### **Good program design lets you easily understand what the program does by reading the code, not just how it does it.**

.panelset.sideways[
.panel[.panel-name[*Clean Code*]
<img src="img/goodvsbad.PNG" width="600" height="350">
]

.panel[.panel-name[*Concise*]
_"The code should be small so that we can easily understand the intent of the function by just reading it."_

> "The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that."
>
> Robert C. Martin, *Clean Code: A Handbook of Agile Software Craftsmanship*
]

.panel[.panel-name[*Single Responsibility*]
_"Functions should do one thing. They should do it well. They should do it only."_

> Functions should have a single responsibility and not be overly complex. Breaking down tasks into smaller, single-purpose functions improves maintainability.
]
]

---

### Design of admiral - Ways to Think About Data

.panelset[
.panel[.panel-name[Conceptual Diagram]
<img src="img/Ways1.PNG" width="650" height="360" style="margin-left:150px;">
]

.panel[.panel-name[Analysts]
<img src="img/Ways2.PNG" width="650" height="360" style="margin-left:150px;">
]

.panel[.panel-name[Programming languages]
<img src="img/Ways3.PNG" width="650" height="360" style="margin-left:150px;">
]
]

---

### Design of admiral - Data Manipulation

<img src="img/dplyr.svg" width="800" height="450" style="margin-left:150px;">

---

### Design of admiral - Data Manipulation

<img src="img/join.svg" width="800" height="450" style="margin-left:100px;">

---

### Design of admiral - Derive Family

> Use Intention-Revealing Names: "The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used."

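
As a minimal sketch of this naming idea, assume a hypothetical helper `derive_var_bmi_sketch()` that is not part of admiral. Its `derive_var_` prefix signals that it adds one variable to the input dataset and returns the dataset, much as the `derive_vars_*()` and `derive_param_*()` functions in this deck announce what they add. The column names and values are made up for illustration, and only base R is used.

```r
# Hypothetical helper for illustration only; NOT an admiral function.
# The name says what it does: derive one variable (BMI) and return the dataset.
derive_var_bmi_sketch <- function(dataset, height_cm, weight_kg, new_var = "BMI") {
  # BMI = weight in kg divided by squared height in metres
  dataset[[new_var]] <- dataset[[weight_kg]] / (dataset[[height_cm]] / 100)^2
  dataset
}

# Made-up example data
vs <- data.frame(
  USUBJID = c("P01", "P02"),
  HEIGHT = c(170, 185),
  WEIGHT = c(75, 90),
  stringsAsFactors = FALSE
)

vs_bmi <- derive_var_bmi_sketch(vs, height_cm = "HEIGHT", weight_kg = "WEIGHT")
round(vs_bmi$BMI, 1)
```

---

### Design of admiral - Derive Family
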
<img src="img/R-Derive.png" width="500" height="300" style="margin-left:300px;">

---

### Design of admiral - Derive Family

.pull-left[
<img src="img/R-Date_timming.png" width="500" height="400" style="margin-right:150px;">
]

.pull-right[
.panelset[
.panel[.panel-name[Usage]
```r
derive_vars_dt(
  dataset,
  new_vars_prefix,
  dtc,
  highest_imputation = "n",
  date_imputation = "first",
  flag_imputation = "auto",
  min_dates = NULL,
  max_dates = NULL,
  preserve = FALSE
)
```
]

.panel[.panel-name[Input]
```r
library(admiral)
library(tibble)
library(lubridate)

mhdt <- tribble(
  ~MHSTDTC,
  "2019-07-18T15:25:40",
  "2019-07-18T15:25",
  "2019-07-18",
  "2019-02",
  "2019",
  "2019---07",
  ""
)
```
]

.panel[.panel-name[R Code]
```r
# Create ASTDT from MHSTDTC
# Partial dates are not imputed, so they become NA and no ASTDTF flag is added
derive_vars_dt(
  mhdt,
  new_vars_prefix = "AST",
  dtc = MHSTDTC
)
```
]

.panel[.panel-name[Output]
```r
#> # A tibble: 7 × 2
#>   MHSTDTC               ASTDT
#>   <chr>                 <date>
#> 1 "2019-07-18T15:25:40" 2019-07-18
#> 2 "2019-07-18T15:25"    2019-07-18
#> 3 "2019-07-18"          2019-07-18
#> 4 "2019-02"             NA
#> 5 "2019"                NA
#> 6 "2019---07"           NA
#> 7 ""                    NA
```
]
]
]

---

### Design of admiral - Derive Family

.pull-left[
<img src="img/R-ADSL.png" width="500" height="400" style="margin-right:150px;">
]

.pull-right[
.panelset[
.panel[.panel-name[Usage]
```r
derive_vars_duration(
  dataset,
  new_var,
  new_var_unit = NULL,
  start_date,
  end_date,
  in_unit = "days",
  out_unit = "days",
  floor_in = TRUE,
  add_one = TRUE,
  trunc_out = FALSE
)
```
]

.panel[.panel-name[R code]
```r
library(lubridate)
library(tibble)

data <- tribble(
  ~USUBJID, ~ASTDT,            ~AENDT,
  "P01",    ymd("2021-03-05"), ymd("2021-03-02"),
  "P02",    ymd("2019-09-18"), ymd("2019-09-18"),
  "P03",    ymd("1985-01-01"), NA,
  "P04",    NA,                NA
)

# Derive the duration (ADURN) and its unit (ADURU) from start and end dates
derive_vars_duration(
  data,
  new_var = ADURN,
  new_var_unit = ADURU,
  start_date = ASTDT,
  end_date = AENDT,
  out_unit = "days"
)
```
]

.panel[.panel-name[Output]
```r
#> # A tibble: 4 × 5
#>   USUBJID ASTDT      AENDT      ADURN ADURU
#>   <chr>   <date>     <date>     <dbl> <chr>
#> 1 P01     2021-03-05 2021-03-02    -3 DAYS
#> 2 P02     2019-09-18 2019-09-18     1 DAYS
#> 3 P03     1985-01-01 NA            NA NA
#> 4 P04     NA         NA            NA NA
```
]
]
]

---

### Design of admiral - Derive Family

.pull-left[
<img src="img/R-BDS.png" width="500" height="400" style="margin-right:150px;">
]

.pull-right[
.panelset[
.panel[.panel-name[Usage]
```r
derive_param_bsa(
  dataset,
  by_vars,
  method,
  set_values_to = exprs(PARAMCD = "BSA"),
  height_code = "HEIGHT",
  weight_code = "WEIGHT",
  get_unit_expr,
  filter = NULL
)
```
]

.panel[.panel-name[Input]
```r
library(tibble)

advs <- tribble(
  ~USUBJID,      ~PARAMCD, ~PARAM,        ~AVAL, ~VISIT,
  "01-701-1015", "HEIGHT", "Height (cm)", 170,   "BASELINE",
  "01-701-1015", "WEIGHT", "Weight (kg)", 75,    "BASELINE",
  "01-701-1015", "WEIGHT", "Weight (kg)", 78,    "MONTH 1",
  "01-701-1015", "WEIGHT", "Weight (kg)", 80,    "MONTH 2",
  "01-701-1028", "HEIGHT", "Height (cm)", 185,   "BASELINE",
  "01-701-1028", "WEIGHT", "Weight (kg)", 90,    "BASELINE",
  "01-701-1028", "WEIGHT", "Weight (kg)", 88,    "MONTH 1",
  "01-701-1028", "WEIGHT", "Weight (kg)", 85,    "MONTH 2",
)
```
]

.panel[.panel-name[R code]
```r
# Add a BSA parameter record per subject and visit where both
# height and weight are available (Mosteller method)
derive_param_bsa(
  advs,
  by_vars = exprs(USUBJID, VISIT),
  method = "Mosteller",
  set_values_to = exprs(
    PARAMCD = "BSA",
    PARAM = "Body Surface Area (m^2)"
  ),
  get_unit_expr = extract_unit(PARAM)
)
```
]

.panel[.panel-name[Output]
```r
#> # A tibble: 10 × 5
#>    USUBJID     PARAMCD PARAM                     AVAL VISIT
#>    <chr>       <chr>   <chr>                    <dbl> <chr>
#>  1 01-701-1015 HEIGHT  Height (cm)                170 BASELINE
#>  2 01-701-1015 WEIGHT  Weight (kg)                 75 BASELINE
#>  3 01-701-1015 WEIGHT  Weight (kg)                 78 MONTH 1
#>  4 01-701-1015 WEIGHT  Weight (kg)                 80 MONTH 2
#>  5 01-701-1028 HEIGHT  Height (cm)                185 BASELINE
#>  6 01-701-1028 WEIGHT  Weight (kg)                 90 BASELINE
#>  7 01-701-1028 WEIGHT  Weight (kg)                 88 MONTH 1
#>  8 01-701-1028 WEIGHT  Weight (kg)                 85 MONTH 2
#>  9 01-701-1015 BSA     Body Surface Area (m^2)   1.88 BASELINE
#> 10 01-701-1028 BSA     Body Surface Area (m^2)   2.15 BASELINE
```
]
]
]

---

class: inverse center middle

# Data Manipulation of Multiple Data Frames

---

### Design of admiral - Data Manipulation

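
A rough base R sketch of the merge pattern behind the `derive_vars_merged()` example that follows: keep the last record per subject from the additional dataset, compute the new variable, left-join it onto the main dataset, and fill in unmatched subjects. It illustrates the idea only and is not admiral's implementation. The data mirror that example's `adsl` and `advs` inputs, and the intermediate names (`advs_sorted`, `last_visit`, `merged`) are made up here.

```r
adsl <- data.frame(
  USUBJID = c("ST42-1", "ST42-2", "ST42-3", "ST42-4"),
  SEX = c("F", "M", "M", "F"),
  COUNTRY = c("AUT", "MWI", "NOR", "UGA"),
  stringsAsFactors = FALSE
)

advs <- data.frame(
  USUBJID = c("ST42-1", "ST42-1", "ST42-2", "ST42-3", "ST42-3"),
  AVISIT = c("BASELINE", "WEEK 2", "BASELINE", "WEEK 2", "WEEK 4"),
  AVISITN = c(0, 1, 0, 1, 2),
  stringsAsFactors = FALSE
)

# 1. Sort by subject and visit number, then keep the last record per subject
#    (the role played by `order` and `mode = "last"`)
advs_sorted <- advs[order(advs$USUBJID, advs$AVISITN), ]
last_visit <- advs_sorted[!duplicated(advs_sorted$USUBJID, fromLast = TRUE), ]

# 2. Compute the new variable on the selected records (the role of `new_vars`)
last_visit$LSTVSCAT <- ifelse(
  last_visit$AVISIT == "BASELINE", "BASELINE", "POST-BASELINE"
)

# 3. Left join onto the main dataset by subject (the role of `by_vars`)
merged <- merge(
  adsl,
  last_visit[, c("USUBJID", "LSTVSCAT")],
  by = "USUBJID",
  all.x = TRUE
)

# 4. Fill subjects without any matching record (the role of `missing_values`)
merged$LSTVSCAT[is.na(merged$LSTVSCAT)] <- "MISSING"

merged
```

Writing these four steps out by hand each time is what a single intention-revealing call replaces.

---

### Design of admiral - Data Manipulation
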
.pull-left[
.panelset[
.panel[.panel-name[Usage]
```r
derive_vars_merged(
  dataset,
  dataset_add,
  by_vars,
  order = NULL,
  new_vars = NULL,
  filter_add = NULL,
  mode = NULL,
  match_flag = NULL,
  missing_values = NULL,
  check_type = "warning",
  duplicate_msg = NULL
)
```
]

.panel[.panel-name[Input]
```r
library(dplyr, warn.conflicts = FALSE)

adsl <- tribble(
  ~USUBJID, ~SEX, ~COUNTRY,
  "ST42-1", "F",  "AUT",
  "ST42-2", "M",  "MWI",
  "ST42-3", "M",  "NOR",
  "ST42-4", "F",  "UGA"
)

advs <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISIT,    ~AVISITN, ~AVAL,
  "ST42-1", "WEIGHT", "BASELINE", 0,        66,
  "ST42-1", "WEIGHT", "WEEK 2",   1,        68,
  "ST42-2", "WEIGHT", "BASELINE", 0,        88,
  "ST42-3", "WEIGHT", "WEEK 2",   1,        55,
  "ST42-3", "WEIGHT", "WEEK 4",   2,        50
)
```
]

.panel[.panel-name[R code]
```r
# Categorize each subject's last visit; subjects without records get "MISSING"
derive_vars_merged(
  adsl,
  dataset_add = advs,
  by_vars = exprs(USUBJID),
  new_vars = exprs(
    LSTVSCAT = if_else(AVISIT == "BASELINE", "BASELINE", "POST-BASELINE")
  ),
  order = exprs(AVISITN),
  mode = "last",
  missing_values = exprs(LSTVSCAT = "MISSING")
)
```
]

.panel[.panel-name[Output]
```r
#> # A tibble: 4 × 4
#>   USUBJID SEX   COUNTRY LSTVSCAT
#>   <chr>   <chr> <chr>   <chr>
#> 1 ST42-1  F     AUT     POST-BASELINE
#> 2 ST42-2  M     MWI     BASELINE
#> 3 ST42-3  M     NOR     POST-BASELINE
#> 4 ST42-4  F     UGA     MISSING
```
]
]
]

.pull-right[
<img src="img/R-merge.png" width="400" height="300" style="margin-right:10px;">
]

---

### Design of admiral - Data Manipulation

.pull-left[
.panelset[
.panel[.panel-name[Usage]
```r
derive_vars_joined(
  dataset,
  dataset_add,
  by_vars = NULL,
  order = NULL,
  new_vars = NULL,
  join_vars = NULL,
  filter_add = NULL,
  filter_join = NULL,
  mode = NULL,
  missing_values = NULL,
  check_type = "warning"
)
```
]

.panel[.panel-name[Input Data]
```r
adae <- tribble(
  ~USUBJID, ~ASTDT,       ~AESEQ,
  "1",      "2020-02-02", 1,
  "1",      "2020-02-04", 2
) %>%
  mutate(ASTDT = ymd(ASTDT))

ex <- tribble(
  ~USUBJID, ~EXSDTC,
  "1",      "2020-01-10",
  "1",      "2020-01",
  "1",      "2020-01-20",
  "1",      "2020-02-03"
)
```
]

.panel[.panel-name[R code]
```r
# Days from the last complete exposure start date on or before
# the adverse event start date
derive_vars_joined(
  adae,
  dataset_add = ex,
  by_vars = exprs(USUBJID),
  order = exprs(EXSDT = convert_dtc_to_dt(EXSDTC)),
  new_vars = exprs(LDRELD = compute_duration(
    start_date = EXSDT, end_date = ASTDT
  )),
  filter_add = !is.na(EXSDT),
  filter_join = EXSDT <= ASTDT,
  mode = "last"
)
```
]

.panel[.panel-name[Output]
```r
#> # A tibble: 2 × 4
#>   USUBJID ASTDT      AESEQ LDRELD
#>   <chr>   <date>     <dbl>  <dbl>
#> 1 1       2020-02-02     1     14
#> 2 1       2020-02-04     2      2
```
]
]
]

.pull-right[
<img src="img/R-join.png" >
]

---

### Design of admiral

.pull-left[
#### **Multiple Sources**

<img src="img/R-multiple.png" width="400" height="300" style="margin-right:50px;">
]

.pull-right[
#### **Higher Order**

<img src="img/R-higher.png" width="400" height="300" style="margin-right:50px;">
]

---

## How to use admiral

.pull-left[
<img src="img/R-adsl_workflow.png" width="300" height="420" style="margin-left:100px;" >
]

.pull-right[
.panelset[
.panel[.panel-name[Tab1]
```r
# Derive the first treatment start datetime (TRTSDTM)
# and its time imputation flag (TRTSTMF)
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  )
```
]

.panel[.panel-name[Tab2]
```r
# Convert the character date to a numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)

# Merge the end-of-study date (EOSDT) from the disposition records
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
```
]

.panel[.panel-name[Tab3]
```r
# Define adverse events with a fatal outcome as a source for the cause of death
src_ae <- dthcaus_source(
  dataset_name = "ae",
  filter = AEOUT == "FATAL",
  date = convert_dtc_to_dtm(AESTDTC, highest_imputation = "M"),
  mode = "first",
  dthcaus = AEDECOD
)
```
]

.panel[.panel-name[Tab4]
```r
# Derive the last known alive date (LSTALVDT) as the latest date
# across the listed sources
adsl <- adsl %>%
  derive_var_extreme_dt(
    new_var = LSTALVDT,
    ae_start_date, ae_end_date, lb_date, trt_end_date,
    source_datasets = list(ae = ae, adsl = adsl, lb = lb),
    mode = "last"
  )
```
]

.panel[.panel-name[Tab5]
```r
# Flag subjects with a dose, or a zero dose of placebo, as SAFFL
# (str_detect() comes from stringr)
adsl <- adsl %>%
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  )
```
]
]
]

---

## Summary

- Admiral, as an **open-source project**, allows anyone to access, study, modify, and enhance its source code. This fosters transparency, education, and innovation, and lets teams and individuals collaborate to improve the tool.
- The admiral package advocates **simple, composable small functions** to build complex behavior, which improves the readability and maintainability of the code.
- This project provides an entry point for **collaboration, co-creation, and contribution for everyone** in the pharmaceutical industry. This collaborative effort helps achieve standardized, consistent methods for developing ADaMs across the industry.

---

class: center, middle

## Acknowledgement

#### Qiao Yan